Goto

Collaborating Authors

 lead time




OceanForecastBench: A Benchmark Dataset for Data-Driven Global Ocean Forecasting

Jia, Haoming, Han, Yi, Wang, Xiang, Wang, Huizan, Wu, Wei, Zheng, Jianming, Xiao, Peikun

arXiv.org Machine Learning

Global ocean forecasting aims to predict key ocean variables such as temperature, salinity, and currents, which is essential for understanding and describing oceanic phenomena. In recent years, data-driven deep learning-based ocean forecast models, such as XiHe, WenHai, LangYa and AI-GOMS, have demonstrated significant potential in capturing complex ocean dynamics and improving forecasting efficiency. Despite these advancements, the absence of open-source, standardized benchmarks has led to inconsistent data usage and evaluation methods. This gap hinders efficient model development, impedes fair performance comparison, and constrains interdisciplinary collaboration. To address this challenge, we propose OceanForecastBench, a benchmark offering three core contributions: (1) A high-quality global ocean reanalysis data over 28 years for model training, including 4 ocean variables across 23 depth levels and 4 sea surface variables. (2) A high-reliability satellite and in-situ observations for model evaluation, covering approximately 100 million locations in the global ocean. (3) An evaluation pipeline and a comprehensive benchmark with 6 typical baseline models, leveraging observations to evaluate model performance from multiple perspectives. OceanForecastBench represents the most comprehensive benchmarking framework currently available for data-driven ocean forecasting, offering an open-source platform for model development, evaluation, and comparison. The dataset and code are publicly available at: https://github.com/Ocean-Intelligent-Forecasting/OceanForecastBench.


A Diffusion-Based Framework for High-Resolution Precipitation Forecasting over CONUS

Vicens-Miquel, Marina, McGovern, Amy, Hill, Aaron J., Foufoula-Georgiou, Efi, Guilloteau, Clement, Shen, Samuel S. P.

arXiv.org Artificial Intelligence

Accurate precipitation forecasting is essential for hydrometeorological risk management, especially for anticipating extreme rainfall that can lead to flash flooding and infrastructure damage. This study introduces a diffusion-based deep learning (DL) framework that systematically compares three residual prediction strategies differing only in their input sources: (1) a fully data-driven model using only past observations from the Multi-Radar Multi-Sensor (MRMS) system, (2) a corrective model using only forecasts from the High-Resolution Rapid Refresh (HRRR) numerical weather prediction system, and (3) a hybrid model integrating both MRMS and selected HRRR forecast variables. By evaluating these approaches under a unified setup, we provide a clearer understanding of how each data source contributes to predictive skill over the Continental United States (CONUS). Forecasts are produced at 1-km spatial resolution, beginning with direct 1-hour predictions and extending to 12 hours using autoregressive rollouts. Performance is evaluated using both CONUS-wide and region-specific metrics that assess overall performance and skill at extreme rainfall thresholds. Across all lead times, our DL framework consistently outperforms the HRRR baseline in pixel-wise and spatiostatistical metrics. The hybrid model performs best at the shortest lead time, while the HRRR-corrective model outperforms others at longer lead times, maintaining high skill through 12 hours. To assess reliability, we incorporate calibrated uncertainty quantification tailored to the residual learning setup. These gains, particularly at longer lead times, are critical for emergency preparedness, where modest increases in forecast horizon can improve decision-making. This work advances DL-based precipitation forecasting by enhancing predictive skill, reliability, and applicability across regions.


FuXi-Nowcast: Meet the longstanding challenge of convective initiation in nowcasting

Chen, Lei, Zhu, Zijian, Zhuang, Xiaoran, Qi, Tianyuan, Feng, Yuxuan, Zhong, Xiaohui, Li, Hao

arXiv.org Artificial Intelligence

Accurate nowcasting of convective storms remains a major challenge for operational forecasting, particularly for convective initiation and the evolution of high-impact rainfall and strong winds. Here we present FuXi-Nowcast, a deep-learning system that jointly predicts composite radar reflectivity, surface precipitation, near-surface temperature, wind speed and wind gusts at 1-km resolution over eastern China. FuXi-Nowcast integrates multi-source observations, such as radar, surface stations and the High-Resolution Land Data Assimilation System (HRLDAS), with three-dimensional atmospheric fields from the machine-learning weather model FuXi-2.0 within a multi-task Swin-Transformer architecture. A convective signal enhancement module and distribution-aware hybrid loss functions are designed to preserve intense convective structures and mitigate the rapid intensity decay common in deep-learning nowcasts. FuXi-Nowcast surpasses the operational CMA-MESO 3-km numerical model in Critical Success Index for reflectivity, precipitation and wind gusts across thresholds and lead times up to 12 h, with the largest gains for heavy rainfall. Case studies further show that FuXi-Nowcast more accurately captures the timing, location and structure of convective initiation and subsequent evolution of convection. These results demonstrate that coupling three-dimensional machine-learning forecasts with high-resolution observations can provide multi-hazard, long-lead nowcasts that outperforms current operational systems.


FlowCast: Advancing Precipitation Nowcasting with Conditional Flow Matching

Ribeiro, Bernardo Perrone, Pucer, Jana Faganeli

arXiv.org Artificial Intelligence

Radar-based precipitation nowcasting, the task of forecasting short-term precipitation fields from previous radar images, is a critical problem for flood risk management and decision-making. While deep learning has substantially advanced this field, two challenges remain fundamental: the uncertainty of atmospheric dynamics and the efficient modeling of high-dimensional data. Diffusion models have shown strong promise by producing sharp, reliable forecasts, but their iterative sampling process is computationally prohibitive for time-critical applications. We introduce FlowCast, the first end-to-end probabilistic model leveraging Conditional Flow Matching (CFM) as a direct noise-to-data generative framework for precipitation nowcasting. Unlike hybrid approaches, FlowCast learns a direct noise-to-data mapping in a compressed latent space, enabling rapid, high-fidelity sample generation. Our experiments demonstrate that FlowCast establishes a new state-of-the-art in probabilistic performance while also exceeding deterministic baselines in predictive accuracy. A direct comparison further reveals the CFM objective is both more accurate and significantly more efficient than a diffusion objective on the same architecture, maintaining high performance with significantly fewer sampling steps. This work positions CFM as a powerful and practical alternative for high-dimensional spatiotemporal forecasting.


XiChen: An observation-scalable fully AI-driven global weather forecasting system with 4D variational knowledge

Wang, Wuxin, Ni, Weicheng, Huang, Lilan, Hao, Tao, Fei, Ben, Ma, Shuo, Yuan, Taikang, Zhao, Yanlai, Deng, Kefeng, Li, Xiaoyong, Leng, Hongze, Duan, Boheng, Bai, Lei, Zhang, Weimin, Ren, Kaijun, Song, Junqiang

arXiv.org Artificial Intelligence

Artificial intelligence (AI)-driven models have the potential to revolutionize weather forecasting, but still rely on initial conditions generated by costly Numerical Weather Prediction (NWP) systems. Although recent end-to-end forecasting models attempt to bypass NWP systems, these methods lack scalable assimilation of new types of observational data. Here, we introduce XiChen, an observation-scalable fully AI-driven global weather forecasting system, wherein the entire pipeline, from Data Assimilation (DA) to medium-range forecasting, can be accomplished within only 15 seconds. XiChen is built upon a foundation model that is pre-trained for weather forecasting and subsequently fine-tuned to serve as both observation operators and DA models, thereby enabling the scalable assimilation of conventional and raw satellite observations. Furthermore, the integration of Four-Dimensional Variational (4DVar) knowledge ensures XiChen to achieve DA and medium-range forecasting accuracy comparable to operational NWP systems, with skillful forecasting lead time beyond 8.75 days. A key feature of XiChen is its ability to maintain physical balance constraints during DA, enabling observed variables to correct unobserved ones effectively. In single-point perturbation DA experiments, XiChen exhibits flow-dependent characteristics similar to those of traditional 4DVar systems. These results demonstrate that XiChen holds strong potential for fully AI-driven weather forecasting independent of NWP systems.


Probability calibration for precipitation nowcasting

Kurki, Lauri, Cabrera, Yaniel, Karanko, Samu

arXiv.org Artificial Intelligence

Reliable precipitation nowcasting is critical for weather-sensitive decision-making, yet neural weather models (NWMs) can produce poorly calibrated probabilistic forecasts. Standard calibration metrics such as the expected calibration error (ECE) fail to capture miscalibration across precipitation thresholds. We introduce the expected thresholded calibration error (ETCE), a new metric that better captures miscalibration in ordered classes like precipitation amounts. We extend post-processing techniques from computer vision to the forecasting domain. Our results show that selective scaling with lead time conditioning reduces model miscalibration without reducing the forecast quality.


High-Resolution Probabilistic Data-Driven Weather Modeling with a Stretched-Grid

Nordhagen, Even Marius, Haugen, Håvard Homleid, Salihi, Aram Farhad Shafiq, Ingstad, Magnus Sikora, Nipen, Thomas Nils, Seierstad, Ivar Ambjørn, Frogner, Inger-Lise, Clare, Mariana, Lang, Simon, Chantry, Matthew, Dueben, Peter, Kristiansen, Jørn

arXiv.org Artificial Intelligence

We present a probabilistic data-driven weather model capable of providing an ensemble of high spatial resolution realizations of 87 variables at arbitrary forecast length and ensemble size. The model uses a stretched grid, dedicating 2.5 km resolution to a region of interest, and 31 km resolution elsewhere. Based on a stochastic encoder-decoder architecture, the model is trained using a loss function based on the Continuous Ranked Probability Score (CRPS) evaluated point-wise in real and spectral space. The spectral loss components is shown to be necessary to create fields that are spatially coherent. The model is compared to high-resolution operational numerical weather prediction forecasts from the MetCoOp Ensemble Prediction System (MEPS), showing competitive forecasts when evaluated against observations from surface weather stations. The model produced fields that are more spatially coherent than mean squared error based models and CRPS based models without the spectral component in the loss.


Agentic AI Framework for Cloudburst Prediction and Coordinated Response

Syed, Toqeer Ali, Khan, Sohail, Jan, Salman, Ali, Gohar, Nauman, Muhammad, Akarma, Ali, Ali, Ahmad

arXiv.org Artificial Intelligence

The challenge is growing towards extreme and short-duration rainfall events like a cloudburst that are peculiar to the traditional forecasting systems, in which the predictions and the response are taken as two distinct processes. The paper outlines an agentic artificial intelligence system to study atmospheric water-cycle intelligence, which combines sensing, forecasting, downscaling, hydrological modeling and coordinated response into a single, interconnected, priceless, closed-loop system. The framework uses autonomous but cooperative agents that reason, sense, and act throughout the entire event lifecycle, and use the intelligence of weather prediction to become real-time decision intelligence. Comparison of multi-year radar, satellite, and ground-based evaluation of the northern part of Pakistan demonstrates that the multi-agent configuration enhances forecast reliability, critical success index and warning lead time compared to the baseline models. Population reach was maximised, and errors during evacuation were minimised through communication and routing agents, and adaptive recalibration and transparent auditability were provided by the embedded layer of learning. Collectively, this leads to the conclusion that collaborative AI agents are capable of transforming atmospheric data streams into practicable foresight and provide a platform of scalable adaptive and learning-based climate resilience.